Identification of eukaryotic internal ribosome entry site (IRES) elements

ABSTRACT

A method of search for and identification of a eukaryotic IRES element active in cap-independent translation of RNA in eukaryotic cells is provided, comprising the following steps: i) screening eukaryotic mRNA sequences or corresponding DNA sequences for a potential IRES element having a block of nucleotides having: a) a length of at least 30 nucleotides; b) an adenine nucleotide content of at least 40 mol-%; and c) a pyrimidine nucleotide content of less than 40 mol-%; ii) inserting said potential IRES element into a linear dicistronic construct between an upstream gene and a downstream GUS reporter gene, whereby said potential IRES element is positioned for IRES-dependent translation of said downstream GUS gene and whereby said upstream gene is preceded by a stable hairpin structure to prevent IRES-independent translation of said genes; and iii) testing said potential IRES element for IRES-dependent translation of said GUS gene in a rabbit reticulocyte lysate or in a wheat germ extract in vitro translation assay, whereby GUS gene expression is quantitated preferably relative to a construct having a reference IRES element or a non-IRES element between said upstream gene and said GUS gene.

FIELD OF INVENTION

The present invention is directed to a method of searching for and identifying novel eukaryotic IRES elements active in plant, mammalian and yeast cells and to a process of expressing a nucleotide sequence of interest in eukaryotic cells cap-independently using an IRES element identified or identifiable according to the above method. The invention further relates to IRES elements identified or identifiable according to the above method and to transgenic or transiently transfected eukaryotic cells transformed with a vector containing such an IRES element.

BACKGROUND OF THE INVENTION

There is a growing interest in using internal ribosome entry site (IRES) elements for cap-independent expression of foreign genes in eukaryotic cells. Although the number of published nucleotide sequences shown to promote cap-independent translation is increasing rapidly, identification of new IRESes published so far was occasional and accidental and there is no distinct methodology of prediction of IRES activity.

Translation initiation of mRNAs in eukaryotic cells is a complex process that involves the concerted interaction of numerous factors (Pain (1996) Eur.J.Biochem. 236, 747-771). For most mRNAs, the first step is the recruitment of ribosomal 40S subunits onto the mRNA at or near the capped 5′end (FIG. 1). Association of 40S subunits with mRNA is greatly facilitated by the cap-binding protein complex eIF4F. Factor eIF4F is composed of three subunits: the RNA helicase eIF4A, the cap-binding protein eIF4E, and the multi-adaptor protein eIF4G, which acts as a scaffold for the proteins in the complex and has binding sites for eIF4E, eIF4A, eIF3, and poly(A) binding protein.

Infection of cells by a variety of RNA viruses results in the selective inhibition of translation of host but not of viral mRNAs. For example, infection of cells with poliovirus, a cytoplasmic RNA virus, results in the modification of several translation initiation factors. Specifically, the proteolysis of both forms of eIF4G, eIF4GI and eIF4GII (Gradi et al., (1998) Proc. Natl. Acad. Sci. USA 95, 11089-11094), by virally encoded proteases results in inhibition of translation of most capped cellular mRNAs. In contrast, the translation of polioviral mRNA, which contains a 450-nt sequence in the viral 5′ noncoding region (5′NCR) that can recruit 40S subunits in the absence of intact eIF4F, is not inhibited. This sequence element was termed an “internal ribosome entry site” or “IRES” (Jang et al., 1988. J.Virol. 62, 2636-2643). Such IRES elements have been found in picornaviral, flaviviral, pestiviral, retroviral, lentiviral and insect viral RNAs (Table 1) and in animal cellular RNAs (Table 2). IRES-containing animal mRNAs can presumably recruit 40S ribosomal subunits via either their capped 5′ ends or their IRES elements, which is likely to make translation possible under conditions where cap-dependent translation is reduced, for example, during viral infection, at the G2/M phase of the cell cycle, and under conditions of stress and apoptosis (Johannes et al., (1999) Proc. Natl. Acad. Sci. USA 96, 13118-13123; Cornelis et al., (2000) Molecular Cell 5, 597-605; Pyronnet et al., (2000) Molecular Cell 5, 607-616; Stein et al., 1998. Mol. and Cell. Biol. 18, 3112-3119; Holcik et al., 2000. Oncogene 19, 4174-4177; Stoneley et al., 2000. Mol. and Cell. Biol. 20, 1162-1169). Up to 3% of animal cellular mRNAs are translated at reduced concentrations of the cap-binding complex eIF4F (Johannes et al., (1999) Proc. Natl. Acad. Sci. USA 96, 13118-13123).

Over the past few years a reconstituted ribosome binding assay has allowed for the elucidation of the mechanisms by which various IRES elements work (Pestova et al., 1998. Genes Dev. 12, 67-83). Some of these elements act by providing a high-affinity binding site for the RNA binding surface on eIF4G. Others work by binding to eIF3 and/or the 40S subunit (see FIG. 2). Recently, an important role of direct IRES RNA/18S rRNA interaction has been shown. The Gtx IRES contains several nonoverlapping segments having complementarity to 18S rRNA that were shown to mediate internal initiation of translation (Hu et al., 1999. Proc. Natl. Acad. Sci. USA 96, 1339-1344). Within one of these segments, a 9-nt GC-rich sequence, CCGGCGGGU, which is 100% complementary to 18S rRNA at nucleotides 1132-1124, was identified. It was shown that synthetic IRESes composed of multiple linked copies of this 9-nt IRES module increased internal initiation dramatically in animal cells (Chappell et al., 2000. Proc. Natl. Acad. Sci. USA 97, 1536-1541).

The 5′ leaders of several plant viral polycistronic genomic RNAs, including those of members of the potyviral and comoviral families, are responsible for conferring cap-independent translation. The IRESes of tobamoviruses and of potato virus X are located in internal parts of the viral genome (Table 2). Tobacco mosaic tobamovirus (TMV) is a positive-stranded RNA plant virus with a monopartite genome 6395 nucleotides (nt) in length (Goelet et al., 1982. Proc. Natl. Acad. Sci. USA 79, 5818-5822). The 5′ proximal ORFs encoding replicative proteins are expressed directly from the genomic RNA, with the smaller (126 kDa) protein produced at more than approximately 10 times the level of the 183 kDa protein, which is produced by occasional readthrough of the stop codon for the 126-kDa ORF (Pelham, 1978, Nature 272, 469-471). Although some replication can occur with only the larger protein, both proteins are required for efficient replication (Ishikawa et al., 1986. Nucl. Acids Res. 14, 8291-8305). The remaining TMV gene products, the movement protein (MP) and the coat protein (CP), are expressed from 3′ coterminal subgenomic mRNAs (sgRNAs) (reviewed by Palukaitis and Zaitlin, 1986 In: “The plant virus”. M. H. V. van Regenmortel and M. Fraenkel-Conrat, Eds. Vol.2, pp. 105-131. Plenum Press, New York). Thus, the internal movement protein (MP) gene and the 3′-proximal coat protein gene cannot be translated from genomic RNA of typical tobamoviruses (TMV U1 is the type member of the genus Tobamovirus). The dicistronic intermediate-length RNA-2, called sgRNA I₂, is translated to produce the 30-kDa MP (Bruening et al., 1976 Virology 71, 498-517; Higgins et al., 1976 Virology 71, 486-497; Beachy and Zaitlin, 1977 Virology 81, 160-169; Goelet and Karn, 1982 J.Mol.Biol. 154, 541-550), whereas the 3′-proximal coat protein (CP) gene of I₂ RNA is translationally silent. This gene is expressed only from a small monocistronic sgRNA (Siegel et al., 1976 Virology 73, 363-371; Beachy and Zaitlin, 1977 Virology 81, 160-169).

It has been shown (Ivanov et al., 1997, Virology 232, 32-43) that, unlike RNA of typical tobamoviruses, the translation of the CP gene of a crucifer-infecting tobamovirus (crTMV) occurred in vitro by an internal ribosome entry mechanism. The genome of crTMV (6312 nts) contains four traditional genes encoding two components of the replicase (the proteins of 122-kDa and 178-kDa, the latter being the readthrough product of the 122-kDa protein), the 30-kDa MP and the 17-kDa CP (Dorokhov et al., 1993 Dokl. Russian Acad Sci. 332, 518-52; Dorokhov et al., 1994 FEBS Lett. 350, 5-8). It was found that the 148-nt region upstream of the CP gene of crTMV RNA contained an internal ribosome entry site (IRES_(CP,148) ^(CR)), promoting internal translation initiation of the CP gene and of different reporter genes (Ivanov et al., (1997) Virology 232, 32-43). By analogy with crTMV, the 3′-proximal CP gene of potato virus X is translated by a mechanism of internal initiation (Hefferon et al., 1997 J. Gen. Virol. 78, 3051-3059; Hefferon et al., 2000. Arch. Virol. 145, 945-956). The capacity of crTMV IRES_(CP) ^(CR) for mediating internal translation distinguishes this tobamovirus from the well-known type member of the genus, TMV U1. The equivalent 148-nt sequence from TMV U1 RNA (U1_(CP,148) ^(SP)) was incapable of mediating internal translation (Ivanov et al., (1997) Virology 232, 32-43).

Recently, it has been shown that the 228- and 75-nt regions upstream of the MP gene of crTMV and TMV U1 RNAs contained IRES elements, IRES_(MP,75) ^(CR) or IRES_(MP,228) ^(CR), which directed expression of the 3′-proximal reporter genes from dicistronic constructs in cell-free translation systems and in isolated protoplasts (Skulachev et al., 1999, Virology 263, 139-154). Moreover, the equivalent sequence from TMV U1 RNA used as the intercistronic spacer (IRES_(MP,75) ^(U1)) was able to mediate internal translation of the second gene in dicistronic transcripts.

There are several inventions wherein an IRES element was used for cap-independent expression of foreign gene(s) from linear multicistronic mRNAs in mammalian cells (U.S. Pat. No. 6,060,273; U.S. Pat. No. 6,114,146; U.S. Pat. No. 5,358,856; U.S. Pat. No. 6,096,505; U.S. Pat. No. 171,821; U.S. Pat. No. 5,766,903), plant cells (WO98/54342) and, generally, in eukaryotic cells (U.S. Pat. No. 171,821; U.S. Pat. No. 5,766,903; U.S. Pat. No. 5,925,565; U.S. Pat. No. 6,114,146). To provide cap-independent IRES-mediated expression of a gene, a circular RNA was developed as well (U.S. Pat. No. 5,766,903). Cap-independent translation of eukaryotic mRNA could be achieved by using the 5′UTR of barley yellow dwarf virus RNA, which is principally different from known IRESes (U.S. Pat. No. 5,910,628). Generally, all inventions used natural IRESes isolated from animal (e.g. U.S. Pat. No. 5,358,856) or plant viruses (WO98/54342) not having cross-kingdom activity, i.e. these IRESes are limited to either plant or animal cells. There are no inventions developing approaches for the creation of non-natural, artificial IRESes that are capable of providing efficient cap-independent gene expression in animal and plant cells. Moreover, there are no approaches for searching for new IRES elements having cross-kingdom activity.

In contrast to animal cell mRNA, there are no published reports on IRES-mediated mRNA translation initiation in plant cells (see Table 2), except for one recent patent application (WO01/59138) describing a plant IRES element of the Arabidopsis RPS18C gene. However, neither the methodology used in this patent application nor the experimental approaches nor the interpretation of the results allows a direct and unambiguous conclusion of IRES activity in vivo, nor a reliable detection of said elements in plant transcripts. First of all, the test for IRES activity was incomplete. Usually, IRES activity is detected with the dicistronic mRNA assay (Pelletier & Sonenberg, 1988, Nature, 334, 320-325). In this test, two types of capped bicistronic mRNAs, in which the two cistrons are separated by a putative IRES element (with and without a hairpin structure in front of the first cistron), are analysed for the ability to provide expression in vitro and in vivo. The construct without the hairpin structure allows ribosome scanning and translation of the first, 5′-proximal cistron, whereas cap-dependent translation of the first cistron is blocked in the construct with the hairpin structure. The authors of WO01/59138 tested the putative IRES element of RPS18C only in the artificial bicistronic transcript without a hairpin structure. Thus, the results presented in WO01/59138 may well be a consequence of reinitiation of translation (Kozak, 2001, Mol. Cell. Biol 21, 1899-1907). Additionally, it is known that a nucleotide sequence having IRES activity in vitro frequently does not show IRES activity in eukaryotic cells. WO01/59138 does not contain experimental evidence showing directly and unambiguously that the putative IRES element of Arabidopsis RPS18C is functional in eukaryotic cells (plant or animal).

In contrast to animal cell mRNA, there are no published reports of in vivo IRES-mediated translation initiation of cellular mRNA in plant cells (Table 2). Low activity of an animal virus IRES (encephalomyocarditis virus IRES, IRES_(EMCV)) in plants was reported (Urwin et al., 2000 Plant J. 24, 583-589). There is no evidence of cross-kingdom (plant, animal, yeast) activity of any IRES element so far. Although the number of published nucleotide sequences that are capable of providing cap-independent translation increases constantly, identification of new IRESes happens only accidentally, and there is no distinct methodology of prediction.

It is therefore an object of the invention to provide a method for identifying novel eukaryotic IRES elements.

It is a further object of the invention to provide novel eukaryotic IRES elements, notably of plant origin.

It is a further object to provide novel IRES elements having cross-kingdom activity.

It is another object of the invention to provide a process of expressing a nucleotide sequence of interest in eukaryotic cell(s) under translational control of a novel IRES element, notably of an IRES element of plant origin.

DESCRIPTION OF THE INVENTION

The above objects are solved according to the claims of the invention. The invention provides a method of search for and identification of a eukaryotic IRES element active in cap-independent translation of RNA in eukaryotic cells, comprising the following steps:

-   (i) screening eukaryotic mRNA sequences or corresponding DNA sequences for a potential IRES element having a block of nucleotides having
    -   (a) a length of at least 30 nucleotides;
    -   (b) an adenine nucleotide content of at least 40 mol-%; and
    -   (c) a pyrimidine nucleotide content of less than 40 mol-%;
-   (ii) inserting said potential IRES element into a linear dicistronic construct between an upstream gene and a downstream GUS reporter gene, whereby said potential IRES element is positioned for IRES-dependent translation of said downstream GUS gene and whereby said upstream gene is preceded by a stable hairpin structure to prevent IRES-independent translation of said genes; and
-   (iii) testing said potential IRES element for IRES-dependent translation of said GUS gene in a rabbit reticulocyte lysate or in a wheat germ extract in vitro translation assay, whereby GUS gene expression is quantitated preferably relative to a construct having a reference IRES element or a non-IRES element between said upstream gene and said GUS gene.

An IRES element which gives rise to GUS expression in at least one of said in vitro translation assays may then be selected.

The inventors of the present invention have surprisingly identified criteria for identifying nucleotide sequences which exhibit IRES activity (IRES elements), i.e. sequences which are capable of providing cap-independent translation of a downstream gene or coding sequence by an internal ribosome entry mechanism. The inventors have found that nucleotide sequences having a block of at least 30 nucleotides with a high adenine content and a low pyrimidine nucleotide content have a high propensity of exhibiting IRES activity. By screening known nucleotide sequences applying the above criteria, a nucleotide sequence or a set of nucleotide sequences with a high propensity of having IRES activity can be identified. The screening of the invention constitutes a pre-selection of nucleotide sequences from all known nucleotide sequences. A pre-selected potential IRES element or a set of such potential IRES elements can then be tested experimentally for its actual (degree of) IRES activity. The screening or pre-selection according to the invention reduces the number of sequences to be tested experimentally for IRES activity enormously such that directed identification of novel IRES elements including experimental confirmation becomes possible for the first time.

Said screening may be carried out on any known nucleotide sequence and on sequences that will become known in the future. Nucleotide sequences of eukaryotic origin, i.e. plant and animal sequences are screened and those of higher plants or higher animals are more preferred. Screening of viral sequences is not within the scope of the invention.

Whole genome sequences including nuclear genomes and organelle genomes like plastid or mitochondrial genomes may be screened. Eukaryotic nuclear genome sequences are preferred. Screening may be carried out on DNA or on RNA sequences. If double-stranded DNA is used, both strands may be screened. The coding strand is screened preferentially. The screening may be restricted to 5′ UTR sequences of genes. It is equivalent to screen 5′ UTR sequences on the mRNA level. The screening criteria of the invention regarding adenine and pyrimidine content refer to messenger RNA or to the corresponding coding strand on the DNA level.

Screening may be carried out, in the simplest case, by eye by scanning along printed or written nucleotide sequences. This approach can be successful, especially if one focuses on 5′ UTR sequences. It is more convenient to employ an automatic screening method, e.g. by using a computer and a suitable computer program. In this way, large databases of nucleotide sequences, notably genome databases, may be screened with the potential of finding many IRES elements.

Herein, “adenine-rich” or “high adenine content” means a content of adenine that is at least 40 mol-%. “Pyrimidine-poor” or “low pyrimidine content” means a combined content of thymine (uracil) and cytosine that is lower than 40 mol-%. As to the criteria applied during screening, a block of at least 30 nucleotides with an adenine content of at least 40, preferably at least 50 and most preferably at least 60 mol-% is searched for. The pyrimidine content should be less than 40 mol-%, preferably less than 30 mol-%, and most preferably less than 20 mol-%.
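The composition thresholds above can be expressed as a small check. The following Python sketch is illustrative only: the function names are hypothetical, and the pairing of each adenine threshold with the corresponding pyrimidine threshold into three named tiers is an assumption made for the example rather than a requirement of the method.

```python
def base_composition(seq):
    """Return (adenine %, pyrimidine %) for an mRNA or coding-strand DNA block."""
    s = seq.upper().replace("T", "U")  # treat DNA thymine as RNA uracil
    if not s:
        raise ValueError("empty sequence")
    adenine = 100.0 * s.count("A") / len(s)
    pyrimidine = 100.0 * (s.count("C") + s.count("U")) / len(s)
    return adenine, pyrimidine


# (label, minimum adenine %, pyrimidine % must be below this value).
# Assumption: each adenine threshold is paired with the matching pyrimidine one.
TIERS = [
    ("most preferred", 60.0, 20.0),
    ("preferred", 50.0, 30.0),
    ("basic", 40.0, 40.0),
]


def stringency_tier(seq, min_len=30):
    """Return the strictest tier the block satisfies, or None if it fails."""
    if len(seq) < min_len:
        return None
    adenine, pyrimidine = base_composition(seq)
    for label, min_a, max_py in TIERS:
        if adenine >= min_a and pyrimidine < max_py:
            return label
    return None


if __name__ == "__main__":
    print(stringency_tier("GAAA" * 16))
    print(stringency_tier("GCGUGGGCGGCGUGGGCGUUUGUUCUUUGUUUGACC"))
```

Under these assumptions, a (GAAA)₁₆ block (75 mol-% adenine, no pyrimidines) falls into the strictest tier, whereas the 36-nt synthetic spacer sequence given in the legend of FIG. 4 contains a single adenine and fails the screen.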

There is no strict upper limit for the length of said block of nucleotides. For practical purposes, said block is chosen to be shorter than 500 nucleotides during screening. Said block has a length of at least 30 nucleotides. It is preferred to search for blocks of between 30 and 200 nucleotides. More preferably, said block has between 40 and 100 nucleotides.

Said block of nucleotides according to the invention has a high propensity of conferring IRES activity to a sequence comprising it. When screening 5′UTR sequences, 2, 3 or even more blocks of nucleotides according to the invention may be found.

If screening is carried out with the help of a computer, it may be done multiple times, whereby the screening criteria may be changed each time. Preferably, one starts with strict criteria, i.e. high adenine content, low pyrimidine content and short nucleotide blocks in order to find sequences with the highest probability of having IRES activity. Further IRES elements may be found by applying less stringent criteria with lower adenine content, higher pyrimidine content, or longer blocks or combinations thereof, within the criteria given above. In this way, IRES elements of various IRES activities may be found, which is useful for achieving a desired expression level when a gene of interest is expressed under translational control of an IRES element.
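A computer-assisted screen with successively relaxed criteria might be sketched as follows in Python. This is a minimal illustration, not the program used by the inventors: the function names, the fixed sliding-window approach, and the particular window lengths assigned to each pass are assumptions. Each reported region is the union of overlapping windows that individually satisfy the criteria of that pass.

```python
def find_blocks(seq, window=30, min_a=40.0, max_py=40.0):
    """Return merged (start, end) regions covered by qualifying windows."""
    s = seq.upper().replace("T", "U")  # screen on the mRNA level
    regions = []
    for start in range(len(s) - window + 1):
        block = s[start:start + window]
        adenine = 100.0 * block.count("A") / window
        pyrimidine = 100.0 * (block.count("C") + block.count("U")) / window
        if adenine >= min_a and pyrimidine < max_py:
            end = start + window
            if regions and start <= regions[-1][1]:
                # window overlaps or touches the previous region: extend it
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


# (label, window length, min adenine %, pyrimidine % below); values are
# illustrative assumptions ordered from strict to relaxed.
PASSES = [
    ("strict", 30, 60.0, 20.0),
    ("intermediate", 40, 50.0, 30.0),
    ("relaxed", 100, 40.0, 40.0),
]


def screen(seq, passes=PASSES):
    """Run every pass and return {label: list of (start, end) regions}."""
    return {label: find_blocks(seq, w, a, py) for label, w, a, py in passes}


if __name__ == "__main__":
    # Hypothetical 5' UTR: an adenine-rich core flanked by GC repeats.
    utr = "GC" * 20 + "GAAA" * 16 + "GC" * 20
    for label, regions in screen(utr).items():
        print(label, regions)
```

Sequences that already produce hits in the strict pass would be the first candidates for experimental testing; hits found only in later passes would be expected, on the reasoning above, to have a lower probability of IRES activity.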

FIG. 7 and FIG. 8 show several 5′ UTR sequences containing nucleotide blocks according to the invention of human and plant origin, respectively. These 5′ UTR sequences are potential IRES elements according to step (i) of claim 1. For some of these potential IRES elements cross-kingdom IRES activity is demonstrated in the examples.

A potential IRES element found in step (i) of claim 1 is then subjected to an experimental confirmation of its IRES activity. For this purpose, a test system was devised comprising the bicistronic DNA construct “H-GFP-GUS” having the following elements in this order: a structure that forms a stable hairpin (H) on the mRNA level, a green fluorescent protein (GFP) gene (GFP coding sequence), an intercistronic spacer with restriction site(s) for inserting potential IRES elements, and the GUS gene (GUS coding sequence). A potential IRES element is inserted into the spacer between GFP and GUS. This construct is then transcribed in vitro using e.g. T7 RNA polymerase to obtain mRNA. The obtained mRNA is then translated in vitro using a rabbit reticulocyte lysate (RRL) or a wheat germ extract (WGE) in vitro translation system. Both in vitro translation systems are commercially available e.g. from Promega and from Roche Diagnostics and may be used according to the manufacturer's instructions. After translation, GUS expression may be determined e.g. via its enzymatic activity and a colorimetric detection, by autoradiography, or by Western blotting.

GUS gene expression is quantitated preferably relative to a construct having a reference IRES element or a non-IRES element between said upstream gene and said GUS gene. As a reference IRES element having strong IRES activity the nucleotide block (GAAA)₁₆ (see examples) or any other known IRES element may be used. As a non-IRES element, said synthetic spacer referred to in FIG. 4 (see examples) may e.g. be used or a random nucleotide block.
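The relative quantitation described above amounts to simple ratios. The Python sketch below assumes that replicate GUS readings (in arbitrary units) are available for the test construct, for the reference IRES construct and for the non-IRES spacer construct; the function names and all numerical values are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def percent_of_reference(gus_test, gus_reference):
    """Mean GUS signal of the test construct as % of the reference IRES."""
    ref = mean(gus_reference)
    if ref <= 0:
        raise ValueError("reference signal must be positive")
    return 100.0 * mean(gus_test) / ref


def fold_over_non_ires(gus_test, gus_non_ires):
    """Mean GUS signal of the test construct relative to a non-IRES spacer."""
    background = mean(gus_non_ires)
    if background <= 0:
        raise ValueError("non-IRES signal must be positive")
    return mean(gus_test) / background


if __name__ == "__main__":
    # Hypothetical replicate readings, arbitrary units.
    test = [40, 60]
    reference = [180, 220]   # e.g. a (GAAA)16 construct
    spacer = [4, 6]          # e.g. a synthetic non-IRES spacer construct
    print(percent_of_reference(test, reference))
    print(fold_over_non_ires(test, spacer))
```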

Said hairpin structure in said bicistronic DNA construct prevents cap-dependent translation, such that all GUS expression can be ascribed to the translational activity of the potential IRES element inserted in said intercistronic spacer of said construct. Said hairpin has to be stable enough to efficiently prevent cap-dependent translation. Preferably, its stability is higher than 30 kcal/mol (see Kozak, M. (1986) Proc. Natl. Acad. Sci. USA 83, 2850-2854). Insufficient stability of said hairpin may be recognized by any expression of said GFP gene. GFP translation may be detected e.g. by way of its fluorescence, by Western blotting or by autoradiography.

The H-GFP-GUS construct used herein was built from plasmid pBluescriptII SK+, a GUS nucleotide sequence and a GFP sequence. The hairpin structure has the sequence: ggtaccgggccccccctcgaggtcgacggtatcgataccgtcgacctcgagggggggcccggtacc. Equivalent structures can be easily created by a person skilled in the art.
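When designing an equivalent structure, a quick check of how far the two arms of a candidate sequence can base-pair may be useful. The Python sketch below counts Watson-Crick pairs inward from both ends of a sequence. It is a rough aid under stated assumptions: the function names are hypothetical, no loop-size constraint is modelled, G-U pairs are ignored, and no free energy is calculated, so it does not replace a proper RNA folding program for estimating stability in kcal/mol.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")
PAIRS = {("a", "t"), ("t", "a"), ("g", "c"), ("c", "g")}


def reverse_complement(seq):
    """Reverse complement of a DNA (or RNA, with u read as t) sequence."""
    return seq.lower().replace("u", "t").translate(COMPLEMENT)[::-1]


def stem_length(seq):
    """Count consecutive Watson-Crick pairs formed inward from both ends."""
    s = seq.lower().replace("u", "t")
    i, j = 0, len(s) - 1
    pairs = 0
    while i < j and (s[i], s[j]) in PAIRS:
        pairs += 1
        i += 1
        j -= 1
    return pairs


if __name__ == "__main__":
    # Hairpin-forming sequence as given in the text above.
    hairpin = ("ggtaccgggccccccctcgaggtcgacggtatcgatacc"
               "gtcgacctcgagggggggcccggtacc")
    print(len(hairpin), stem_length(hairpin))
    print(reverse_complement(hairpin) == hairpin)
```

A sequence that equals its own reverse complement is a perfect inverted repeat, which is the simplest way to obtain a long uninterrupted stem.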

All aspects in connection with these in vitro translation systems are well studied and known in the prior art. Details can be found in the following documents and in references cited therein: Anderson, C., et al. (1985) Meth. Enzymol. 101,635; Krieg, P. and Melton, D. (1984) Nucl. Acids Res. 12, 7057; King, R. W. et al.(1997) Science 277, 973; DiDonato, J. A. and Karin, M. (1993). Promega Notes 42, 18; Pelham, H. R. B. and Jackson, R. J. (1976) Eur. J. Biochem. 67, 247; Jackson, R. J. and Hunt, T. (1983) Meth. Enzymol. 96, 50; Technical Bulletins No. 126, and No.165 or Technical Manual No.232, all from Promega Corp.

Potential IRES elements which give rise to GUS gene expression in such a WGE or RRL translation assay are IRES elements according to the invention. The IRES elements identified or identifiable according to the invention typically exhibit cross-kingdom activity, i.e. they can be used to express a gene of interest under translational control of said IRES in plants and in animals. In spite of said cross-kingdom activity, the activity of said IRES element is normally not the same when expression of a gene of interest is compared in plant and animal systems. Variations in expression levels exist between the two in vitro translation systems mentioned above. In in vivo systems, these variations are in general even higher. Still, the IRES elements identified in this invention show surprisingly high IRES activity in both in vitro systems, in plant cells and in animal cells.

It should be mentioned that IRES activity may be detected not only by the above in vitro translation assays but also in plant or animal cells or in vivo. Series of synthetic sequences which were used as intercistronic spacers in the bicistronic H-GFP-GUS vectors were created and examined in RRL, tobacco protoplasts and HeLa cells. Further, we created two synthetic sequences representing four linked copies of a 19-nt direct repeat (Ecp ×4) and 16 copies of the GAAA sequence (GAAA)₁₆ which both turned out to function as IRES elements. The important role of adenine-rich nucleotide sequences was proven in plant and animal translation systems in vitro and in vivo.

The use of the principle of the invention made it possible to reveal numerous plant IRES elements that are universal and are efficient translation initiation motifs for different kingdoms of living organisms. To the best of our knowledge, we have identified the first IRES elements in genomes of plant organisms.

This invention further provides a process of expressing a nucleotide sequence of interest in eukaryotic cell(s) by introducing into said cell(s) a vector comprising said nucleotide sequence of interest operably linked to an upstream IRES element identified or identifiable in accordance with the above method of search for and identification of a eukaryotic IRES element, whereby said nucleotide sequence of interest is translated cap-independently by way of said IRES element.

Said nucleotide sequence of interest may be expressed in a plant or in plant cells. It may also be expressed in an animal or in animal cells. Among animals, mammalian cells or mammalian animals including humans (e.g. for gene therapy) are preferred. Further, said nucleotide sequence of interest may be expressed in fungi, preferably in yeast cells.

The IRES element contained in said vector is of eukaryotic origin, preferably of plant or of mammalian origin. More preferably, said IRES element comprises a sequence according to one of the sequences of FIG. 7 or FIG. 8 or an IRES-functional portion thereof. Even more preferably, said IRES element is derived from the 5′UTR of one of the following plant genes: heat shock factor-1, poly(A) binding protein, 48K MAP kinase. Most preferably, these genes are from tobacco.

Said process of expressing of the invention does not comprise use of one of the sequences shown in FIG. 11 as IRES element.

Said nucleotide sequence of interest may be expressed from monocistronic mRNA. However, the high potential of IRES technology lies in bicistronic or polycistronic expression, which allows the expression of subunits of a protein complex or the engineering of a whole biochemical cascade or pathway.

Said eukaryotic cells or eukaryotic organisms may be stably transformed or transiently transfected with said vector. Methods of transforming or transfecting animal or plant cells are well-known in the art. Various methods can be used to deliver a DNA or RNA vector into the plant cell, including direct introduction of said vector into a plant cell by means of microprojectile bombardment, electroporation or PEG-mediated treatment of protoplasts (for review see: Gelvin, S. B., 1998, Curr. Opin. Biotechnol., 9, 227-232; Hansen & Wright, 1999, Trends Plant Sci., 4, 226-231). Plant RNA and DNA viruses are also efficient delivery systems (Hayes et al., 1988, Nature, 334, 179-182; Palmer et al., 1999, Arch. Virol., 144, 1345-1360; Lindbo et al., 2001, Curr. Opin. Plant. Biol., 4, 181-185). Said vectors can deliver a transgene either for stable integration into the genome/plastome of the plant (direct or Agrobacterium-mediated DNA integration) or for transient expression of the transgene (“agroinfiltration”). Similarly, animal cells may be electroporated, infected by viral vectors or transfected using Lipofectin.

Construction of the vectors for the expression process of the invention may be done according to standard procedures of molecular biology. Specific embodiments are outlined in the figures and in the examples section.

The invention comprises IRES elements identified or identifiable by the method of the invention. Particularly, the invention comprises an IRES element having a sequence of one of the sequences of FIG. 7 or FIG. 8 or an IRES-functional portion thereof.

The invention discloses for the first time IRES elements of plant origin. Such IRES elements are therefore comprised by the invention as are vectors containing such IRES elements. Preferably, an IRES contained in leaders of the Arabidopsis RPS18 gene family, notably in the leader of the RPS18C gene, or in leaders of genes homologous or orthologous thereto is excluded from the IRES elements of this invention.

Moreover, the invention comprises transgenic or transiently transfected eukaryotic cells transformed or transfected with a vector containing an IRES element according to the invention.

Furthermore, an RNA vector containing novel IRES RNA sequences, as well as a DNA vector containing DNA copies of novel IRES DNA sequences are described.

The invention is useful, as it allows one skilled in the art to identify previously unknown natural translational elements with IRES activity. It allows the identification of IRES elements that are more active than previously described ones. In addition, it allows the identification of IRES elements that are universal and active across different taxonomic kingdoms.

An essential advantage of the present invention is the possibility to express more than two genes in multicistronic cassettes in plant cells transiently or stably transformed (transgenic plants).

Another advantage of the present invention is the possibility to express more than two genes in multicistronic cassettes in human and mammalian cells transiently and stably transformed (transgenic animals).

Another advantage of the present invention is the possibility to express more than two genes in multicistronic cassettes in yeast cells.

A further advantage provided by the present invention is the possibility to construct virus vectors for expressing foreign genes via the novel IRESes of the invention in mammals and especially in humans.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1. Simplified model of canonical eukaryotic cap-dependent translation initiation. The eIF4E-eIF4G interaction targets the small ribosomal subunit to the 5′end of the mRNA. The eIF4G also interacts with Pab1p, eIF3, and the RNA helicase eIF4A to mediate the initiation process. ORF: open reading frame; PABP: poly (A)-binding protein.

FIG. 2. Simplified model of the mechanism of ribosome recruitment to mRNA during hepatitis C virus IRES-mediated translation initiation. The IRES element bypasses the need for an eIF4E-eIF4G interaction by providing alternative means by which the ribosome is recruited to the mRNA. Arrows indicate the various direct interactions between IRES elements and the initiation 40S complex.

FIG. 3 depicts the nucleotide sequence of NtHSF-1 mRNA 5′leader (EMBL accession code AB014483) (A) and cloning steps (B). The initiation codon AUG is emphasised in italics.

FIG. 4 shows an autoradiograph of proteins translated in WGE from H-GFP-GUS containing (i) the 36-nt artificial sequence (spacer) GCGUGGGCGGCGUGGGCGUUUGUUCUUUGUUUGACC, (ii) the 453-nt NtHSF-1 mRNA 5′leader, (iii) (GAAA)₁₆ and IRES_(CP,148) ^(CR) as intercistronic spacers. Arrows indicate the position of GUS and expected positions of GFP.

FIG. 5 shows a diagram of GUS gene expression in tobacco protoplasts transfected with 35S-based hairpin-less bicistronic GFP-GUS constructs containing the nucleotide sequences indicated under the diagram bars. GUS activity values of each transgenic plant were normalized to the GFP content in the same protoplast sample determined by densitometry of GFP bands from Western blots. At least three independent experiments were used for the calculation of average values and standard errors depicted as vertical bars.

FIG. 6 shows a diagram of GUS gene expression in HeLa cells transfected with bicistronic T7-based construct H-GFP-GUS. GUS values were normalised to protein contents of samples. At least three independent experiments were used for the calculation of average values and standard errors depicted as vertical bars.

FIG. 7 shows human 5′UTR nucleotide sequences as putative human IRESes.

FIG. 8 shows plant 5′UTR nucleotide sequences as putative plant IRESes.

FIG. 9 depicts experimental steps of identification and isolation of a novel plant IRES from the 5′UTR of tobacco poly(A)-binding protein (PABP) mRNA.

FIG. 10 depicts experimental steps of identification and isolation of a novel plant IRES from the 5′UTR of tobacco 48K MAP kinase mRNA.

FIG. 11 shows nucleotide sequences of known animal and viral IRES elements.

In the following, the invention will be further described using specific examples. Standard molecular biological techniques were carried out according to Sambrook et al. (1989, Molecular Cloning: a Laboratory Manual. 2nd edn. Cold Spring Harbor, N.Y.). All plasmids utilized in the invention can be prepared according to the directions of the specification by a person of ordinary skill in the art without undue experimentation employing materials that are readily available.

EXAMPLES

The goal of these examples is to demonstrate approaches for revealing and creating previously unidentified natural IRESes in 5′UTR sequences of eukaryotic mRNAs.

Example 1

Identification of a Novel IRES Sequence Isolated from NtHSF mRNA 5′UTR

Cloning Strategy

The 5′UTR of NtHSF-1 was selected by screening. It contains adenine-rich nucleotide blocks. It was cloned as outlined in FIG. 3 to produce the transcriptional vector pH-GFP-NtHSF-GUS.
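The screening step can be illustrated with a short Python sketch. It is not the program used by the inventors; it is a minimal sliding-window scan, under the assumption that a "block" is any contiguous window of the minimum length (30 nt) with an adenine content of at least 40 mol-% and a pyrimidine content below 40 mol-%. The function names and default parameters are hypothetical. The two test sequences are the 36-nt artificial spacer and (GAAA)₁₆ described for FIG. 4.

```python
from typing import Iterator, Tuple


def find_candidate_blocks(seq: str, min_len: int = 30,
                          min_a: float = 0.40,
                          max_py: float = 0.40) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) of each min_len-nt window meeting the criteria.

    Assumption: a block is a contiguous window of exactly min_len nt;
    longer blocks show up as runs of overlapping windows.
    """
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA input
    for start in range(len(rna) - min_len + 1):
        window = rna[start:start + min_len]
        a_frac = window.count("A") / min_len
        py_frac = (window.count("C") + window.count("U")) / min_len
        if a_frac >= min_a and py_frac < max_py:
            yield start, start + min_len


def has_candidate_block(seq: str, **kwargs) -> bool:
    """Return True if at least one window satisfies the criteria."""
    return next(find_candidate_blocks(seq, **kwargs), None) is not None


if __name__ == "__main__":
    spacer = "GCGUGGGCGGCGUGGGCGUUUGUUCUUUGUUUGACC"  # 36-nt artificial spacer
    gaaa16 = "GAAA" * 16                              # (GAAA)16
    print("spacer passes:", has_candidate_block(spacer))
    print("(GAAA)16 passes:", has_candidate_block(gaaa16))
```

Stricter thresholds can be tried by passing, for example, `min_a=0.60` or `max_py=0.20` as keyword arguments.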

In vitro translational assays: WGE and RRL translation assays were carried out as described in Technical Bulletin No. 165 and No. 126, respectively, from Promega Corporation using the coupled transcription/translation systems of catalogue numbers L4140 and L4610, respectively. Alternatively, a conventional RRL system from Promega, catalogue number L4960 (Technical Manual No. 232), was employed. Linear H-GFP-GUS vectors having potential IRES elements inserted between GFP and GUS were transcribed using the T7 promoter/RNA polymerase system. RNA transcripts were precipitated with LiCl, dissolved in water, and reprecipitated with ethanol. RNA concentrations were measured by spectrophotometry and 5 μg of each transcript was taken for each 25-μl in vitro translation sample. GFP and GUS expression were detected by autoradiography.
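As an illustration of the dosing arithmetic only, the following Python sketch converts a spectrophotometric reading into the transcript volume needed for 5 μg of RNA. The conversion factor of 40 μg/ml per A260 unit is the conventional value for single-stranded RNA and is an assumption here, since the text does not state how concentrations were derived; the example reading and dilution are hypothetical.

```python
# Assumption: 1 A260 unit corresponds to about 40 ug/ml single-stranded RNA.
RNA_UG_PER_ML_PER_A260 = 40.0


def rna_conc_ug_per_ul(a260: float, dilution_factor: float) -> float:
    """Concentration of the undiluted RNA stock in ug/ul."""
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor / 1000.0


def volume_for_mass_ul(mass_ug: float, conc_ug_per_ul: float) -> float:
    """Volume of stock (ul) that contains the requested RNA mass."""
    if conc_ug_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return mass_ug / conc_ug_per_ul


if __name__ == "__main__":
    # Hypothetical reading: A260 = 0.5 measured on a 1:100 dilution.
    conc = rna_conc_ug_per_ul(a260=0.5, dilution_factor=100)
    print("stock concentration (ug/ul):", conc)
    print("volume for 5 ug (ul):", volume_for_mass_ul(5.0, conc))
```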

The transient assay system from plant protoplasts. The following procedures of protoplast preparation and transfection were used. The protoplasts were isolated from N. tabacum (cv. W38) leaves as described (Saalbach et al., 1996 Plant Physiol. 112, 975-985). Aliquots of 4×10⁵ protoplasts were transfected with 30 μg of pFF19-based dicistronic DNA constructs “GFP-spacer-GUS” and incubated for 36 hours at 25° C. in the dark. GUS activity was determined according to Jefferson (1987 Plant Mol. Biol. Rep. 5, 387-405) using MUG and was measured as relative light units (RLU). For each experiment, background GUS activity associated with non-transfected protoplasts was subtracted. Protein concentrations were estimated using the Bio-Rad protein assay kit based on the method of Bradford (1976 Anal. Biochem. 72, 248-254). GFP expression was detected by western-blot analysis using monoclonal mouse antibodies (Boehringer Mannheim No. 1814460) according to the manufacturer's manual. GFP amounts in western-blot bands were calculated using Bio-Rad Quantity One software.
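The background subtraction and normalization described above can be sketched in Python as follows. This is an illustrative calculation, not the inventors' analysis: the function name is hypothetical, the GFP amount is taken as an arbitrary densitometry value, and the numbers in the usage lines are invented for demonstration.

```python
def normalized_gus(gus_rlu: float, background_rlu: float,
                   gfp_amount: float) -> float:
    """Background-corrected GUS activity per unit of GFP.

    gus_rlu        raw GUS reading of the transfected sample (RLU)
    background_rlu reading of non-transfected protoplasts (RLU)
    gfp_amount     GFP band intensity from western-blot densitometry
                   (arbitrary units; must be positive)
    """
    if gfp_amount <= 0:
        raise ValueError("GFP amount must be positive")
    corrected = max(gus_rlu - background_rlu, 0.0)  # clamp negative noise to zero
    return corrected / gfp_amount


if __name__ == "__main__":
    # Hypothetical readings for one protoplast sample.
    print(normalized_gus(gus_rlu=1200.0, background_rlu=200.0, gfp_amount=2.0))
```

Clamping at zero is a design choice of this sketch; a sample that reads below background is reported as having no activity rather than a negative value.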

Transfection of HeLa Cells Using Vaccinia Virus and T7 Promoter Containing Plasmids, Encoding for GUS

HeLa cell monolayers were grown on 3.5 cm Petri dishes in Dulbecco's modified minimal essential medium supplemented with 10% heat-inactivated fetal calf serum, and 100 units/ml streptomycin and penicillin. Virus stocks of modified vaccinia virus Ankara (MVA) expressing the bacteriophage T7 RNA polymerase gene were made according to usual methods. HeLa cell dishes that were 80-90% confluent were infected with virus using 30-40 pfu/cell. After a 45 min adsorption period the cells were washed and transfected using Opti-MEM (Life Technologies, Inc.), plasmid DNA and Lipofectin (Life Technologies, Inc.). A transfection mixture of 2 μg DNA in 5 μl Lipofectin was used for a 3.5 cm plate. For each construct, 6 plates were used in each experiment. Cells were incubated at 37° C. for 6 h. After incubation the medium was removed, and the cells were washed twice with PBS and lysed directly on the plate in 250 μl lysis buffer (100 mM KHPO₃ pH7.8, 0.2% Triton X-100, 0.5 mM DTT) for 10 minutes. The lysate was collected, clarified by centrifugation at 2000 g for 10 minutes and stored at −70° C. GUS activity was detected in 20 μl of lysate using the GUS Light™ reagent system (Tropix, Mass., USA) according to the manufacturer's protocols.

Results

The 453-nt 5′leader of Nicotiana tabacum heat shock factor 1 (NtHSF-1, EMBL nucleotide database accession code AB014483) was isolated from a tobacco cDNA bank and was used as an intercistronic spacer in bicistronic construct H-GFP-GUS, which was tested in WGE and RRL in vitro translational systems (FIG. 4) and also in tobacco protoplasts (FIG. 5) and HeLa cells (FIG. 6). FIG. 4 shows the results of translation in WGE of bicistronic construct H-GFP-GUS containing the NtHSF-1 5′ leader in comparison with the 36-nt artificial sequence GCGUGGGCGGCGUGGGCGUUUGUUCUUUGUUUGACC taken as a negative control, and (GAAA)₁₆ and IRES_(CP,148) ^(CR) taken as positive controls. It can be seen that the NtHSF-1 leader provided efficient expression of the GUS gene in WGE although expression of the 5′-proximal gene (GFP) was blocked by a stable hairpin structure. Analogous results were obtained with the RRL system (data not shown). To confirm the results obtained in vitro, we tested the NtHSF-1 leader for IRES activity in tobacco protoplasts and human HeLa cells. The results presented in FIGS. 5 and 6, respectively, show that the NtHSF-1 leader possessed IRES activity comparable to IRES_(CP,148) ^(CR) in both types of eukaryotic cells. Thus, it was concluded that the NtHSF-1 leader, like IRES_(CP,148) ^(CR), possesses cross-kingdom activity and can be considered an IRES sequence (IRES^(NtHSF-1)).

The results demonstrate that application of the search criteria of the present invention allows the identification of IRES elements in the 5′ leaders of eukaryotic mRNAs.

Example 2

Identification of a Novel IRES Sequence Isolated from Tobacco Poly(A) Binding Protein mRNA 5′UTR

An analysis of human and plant mRNA 5′UTRs according to the criteria of the invention revealed other sequences with a high chance of having IRES activity (FIGS. 7 and 8). From this list of sequences we selected the 5′UTR of tobacco poly(A)-binding protein (PABP) to confirm the efficiency of our methodology of prediction and identification of novel IRES sequences.

Cloning strategy: The approach for PABP IRES isolation is described in FIG. 9.

The obtained DNA constructs were linearized with NotI enzyme and transcribed with T7 polymerase. RNA transcripts were precipitated with LiCl, dissolved in water, and reprecipitated with ethanol. RNA concentrations were measured by spectrophotometry and 5 μg of each transcript was taken for each 25-μl in vitro translation sample (RRL, Promega).

The results presented in Table 3 show that the PABP 5′UTR was able to direct efficient GUS gene expression in the RRL system. Additionally, the PABP 5′UTR provided GUS gene expression in HeLa cells as well (Table 4).

Example 3

Identification of a Novel IRES Sequence Isolated from Tobacco 48K MAP Kinase mRNA 5′UTR

The goal of this example is to confirm the validity of our approach for searching novel plant IRESes. We selected from the list of putative IRESes (FIG. 8) the 5′UTR of tobacco 48K MAP kinase (MAPK) to confirm the efficiency of our methodology of prediction and identification of novel IRES sequences.

Cloning strategy: The approach for 48K-MAPK IRES isolation is described in FIG. 10.

The obtained DNA constructs were linearized with NotI enzyme and transcribed with T7 polymerase. RNA transcripts were precipitated with LiCl, dissolved in water, and reprecipitated with ethanol. RNA concentrations were measured by spectrophotometry and 5 μg of each transcript was taken for each 25-μl in vitro translation sample (RRL, Promega).

The results presented in Table 5 show that the 48K-MAPK 5′UTR is capable of directing efficient GUS gene expression in RRL.

TABLE 1 Virus IRESes

| Virus type | Virus | IRES localization | Reference |
|---|---|---|---|
| Animal RNA Viruses | | | |
| Picornaviruses | Poliovirus (PV) | 5′UTR | Pelletier & Sonenberg (1988). Nature 334, 320-325 |
| | Encephalomyocarditis virus (EMCV) | 5′UTR | Jang et al. (1988). J. Virol. 62, 2636-2643 |
| | Foot-and-mouth disease virus (FMDV) | 5′UTR | Kuhn et al. (1990). J. Virol. 64, 4625-4631 |
| Flavivirus | Hepatitis C virus | 5′UTR | Reynolds et al. (1995). EMBO J. 14, 6010-6020 |
| Pestivirus | Classical swine fever virus (CSFV) | 5′UTR | Pestova et al. (1998). Genes and Devel. 12, 67-83 |
| Retrovirus | Murine leukaemia virus (MLV) | 5′UTR | Berlioz & Darlix (1995). J. Virol. 69, 2214-2222 |
| Lentivirus | Simian immunodeficiency virus (SIV) | 5′UTR | Ohlmann et al. (2000). J. Biol. Chem. 275, 11899-11906 |
| Insect RNA virus | | | |
| | Cricket paralysis virus | Two IRESes (5′UTR and internal) | Wilson et al. (2000). Mol. Cell. Biol. 20, 4990-4999 |
| | Plautia stali intestine virus (PSIV) | internal | Sasaki & Nakashima (1999). J. Virol. 73, 1219-1226 |
| Plant RNA Viruses | | | |
| Tobamoviruses | Crucifer infecting tobamovirus (crTMV) | Two IRESes (both internal) | Ivanov et al. (1997). Virology 232, 32-43; Skulachev et al. (1999). Virology 263, 139-154 |
| | Tobacco mosaic virus (TMV) | internal | Skulachev et al. (1999). Virology 263, 139-154 |
| Potexvirus | Potato virus X (PVX) | internal | Hefferon et al. (1997). J. Gen. Virol. 78, 3051-3059 |
| Potyvirus | Tobacco etch virus (TEV) | 5′UTR | Carrington & Freed (1990). J. Virol. 64, 1590-1597 |
| | Turnip mosaic virus (TuMV) | 5′UTR | Basso et al. (1994). J. Gen. Virol. 75, 3157-3165 |
| Comovirus | Cowpea mosaic virus (CpMV) | 5′UTR | Verver et al. (1991). J. Gen. Virol. 72, 2339-2345 |

TABLE 2 Eukaryotic cellular mRNA IRESs

| Gene type | Gene | IRES localization | Reference |
|---|---|---|---|
| Animal cellular mRNA | | | |
| Translation initiation factors | eIF4G | 5′UTR | Johannes & Sarnow (1998). RNA 4, 1500-1513 |
| | DAP5 | 5′UTR | Henis-Korenbit et al. (2000). Mol. Cell. Biol. 20, 496-506 |
| Growth factors | Vascular endothelial growth factor (VEGF) | 5′UTR | Huez et al. (1998). Mol. Cell. Biol. 18, 6178-6190 |
| | Fibroblast growth factor 2 (FGF-2) | 5′UTR | Creancier et al. (2000). J. Cell. Biol. 150, 275-281 |
| | Mnt | 5′UTR | Stoneley et al. (2001). Oncogene 20, 893-897 |
| | Platelet-derived growth factor B (PDGF B) | 5′UTR | Bernstein et al. (1997). J. Biol. Chem. 272, 9356-9362 |
| Homeotic genes | Antennapedia | 5′UTR | Oh et al. (1992). Genes & Devel. 6, 1643-1653 |
| Survival proteins | X-linked inhibitor of apoptosis (XIAP) | 5′UTR | Holcik & Korneluk (2000). Mol. Cell. Biol. 20, 4648-4657 |
| | Apaf-1 | 5′UTR | Coldwell et al. (2000). Oncogene 19, 899-905 |
| Miscellaneous | BiP | 5′UTR | Macejak & Sarnow (1991). Nature 353, 90-94 |
| | Dendrin, neurogranin, α-subunit of the calcium-calmodulin-dependent kinase II, ARC, MAP2 | 5′UTR | Pinkstaff et al. (2001). Proc. Natl. Acad. Sci. USA 98, 2770-2775 |
| | Connexin-32 | 5′UTR | |
| Plant cellular mRNA | | | |
| Heat shock protein | Nicotiana tabacum heat shock factor 1 (NtHSF-1) | 5′UTR | This application |
| Translation initiation factor | Nicotiana tabacum Poly(A)-binding protein (PABP) | 5′UTR | This application |

TABLE 3 GUS gene expression in RRL directed by PABP 5′ UTR in bicistronic construct.

| Construct | 4MU fluorescence after 30′ (RLU) | 4MU fluorescence after 60′ (RLU) |
|---|---|---|
| hGFP-PAB5′UTR-GUS | 158233 | 271943 |
| hGFP-IREScp148-GUS | 208499 | 358621 |
| NoRNA | 643 | 674 |
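To relate the Table 3 readings to the reference IRES, one possible calculation is sketched below in Python. Subtracting the NoRNA reading as a blank and taking a simple ratio is an assumption of this sketch, not a procedure stated in the text; the function and variable names are hypothetical.

```python
# Readings copied from Table 3: (RLU after 30 min, RLU after 60 min).
TABLE_3 = {
    "hGFP-PAB5'UTR-GUS": (158233, 271943),
    "hGFP-IREScp148-GUS": (208499, 358621),
    "NoRNA": (643, 674),
}


def relative_activity(test: float, reference: float, blank: float) -> float:
    """Blank-corrected test signal as a fraction of the reference signal."""
    if reference <= blank:
        raise ValueError("reference signal must exceed the blank")
    return (test - blank) / (reference - blank)


if __name__ == "__main__":
    for idx, label in enumerate(("30 min", "60 min")):
        ratio = relative_activity(
            TABLE_3["hGFP-PAB5'UTR-GUS"][idx],
            TABLE_3["hGFP-IREScp148-GUS"][idx],
            TABLE_3["NoRNA"][idx],
        )
        print(f"PABP 5'UTR vs IREScp148 at {label}: {ratio:.2f}")
```

Under these assumptions the PABP 5′UTR signal is roughly three quarters of the reference IRES signal at both time points.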

TABLE 4 GUS expression in HeLa cells directed by PABP 5′ UTR (all activities in RLU).

| Construct | GUS after 30′ | GUS after 60′ |
|---|---|---|
| HGFP-PABP 5′UTR-GUS | 9397 | 18639 |
| | 8342 | 16765 |
| | 10309 | 20707 |
| HGFP-IREScp-GUS | 237953 | 427810 |
| | 232171 | 437064 |
| | 257613 | 473244 |
| GFP-IREScp-GUS | 273731 | 508842 |
| | 236659 | 445389 |
| Mock (approx.) | 400 | 650 |
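The replicate readings of Table 4 can be summarized as a mean and standard error per construct. The Python sketch below does this for the 30-minute column, under the assumption that the approximate mock value is subtracted as background; names are hypothetical and the summary is illustrative rather than part of the original analysis.

```python
from math import sqrt
from statistics import mean, stdev

# 30-minute readings copied from Table 4 (RLU).
TABLE_4_30MIN = {
    "HGFP-PABP 5'UTR-GUS": [9397, 8342, 10309],
    "HGFP-IREScp-GUS": [237953, 232171, 257613],
    "GFP-IREScp-GUS": [273731, 236659],
}
MOCK_30MIN = 400  # approximate mock value from Table 4


def summarize(values, background):
    """Return (mean, standard error) of background-corrected readings."""
    corrected = [v - background for v in values]
    if len(corrected) < 2:
        raise ValueError("need at least two replicates for a standard error")
    return mean(corrected), stdev(corrected) / sqrt(len(corrected))


if __name__ == "__main__":
    for construct, readings in TABLE_4_30MIN.items():
        avg, sem = summarize(readings, MOCK_30MIN)
        print(f"{construct}: mean {avg:.0f} RLU, SEM {sem:.0f} RLU")
```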

TABLE 5 GUS gene expression in RRL directed by 48K MAPK 5′UTR in bicistronic construct.

| Construct | 4MU fluorescence after 30′ (RLU) |
|---|---|
| hGFP-48K MAPK5′UTR-GUS | 184235 |
| hGFP-IREScp148-GUS | 190210 |
| NoRNA | 1567 |

1. A method of search for and identification of a eukaryotic IRES element active in cap-independent translation of RNA in eukaryotic cells, comprising the following steps: (i) screening eukaryotic mRNA sequences or corresponding DNA sequences for a potential IRES element having a block of nucleotides having (a) a length of at least 30 nucleotides; (b) an adenine nucleotide content of at least 40 mol-%; and (c) a pyrimidine nucleotide content of less than 40 mol-%; (ii) inserting said potential IRES element into a linear dicistronic construct between an upstream gene and a downstream GUS reporter gene, whereby said potential IRES element is positioned for IRES-dependent translation of said downstream GUS gene and whereby said upstream gene is preceded by a stable hairpin structure to prevent IRES-independent translation of said genes; and (iii) testing said potential IRES element for IRES-dependent translation of said GUS gene in a rabbit reticulocyte lysate or in a wheat germ extract in vitro translation assay, whereby GUS gene expression is quantitated preferably relative to a construct having a reference IRES element or a non-IRES element between said upstream gene and said GUS gene.
 2. The method of claim 1, wherein said block of nucleotides has an adenine nucleotide content of at least 50 mol-%.
 3. The method of claim 1, wherein said block of nucleotides has an adenine nucleotide content of at least 60 mol-%.
 4. The method of claim 1, wherein said block of nucleotides has a pyrimidine nucleotide content of less than 30 mol-%.
 5. The method of claim 1, wherein said block of nucleotides has a pyrimidine nucleotide content of less than 20 mol-%.
 6. The method of claim 1, wherein said block of nucleotides has a length of at least 40 nucleotides.
 7. The method of claim 1, wherein said block of nucleotides has a length between 30 and 200 nucleotides.
 8. The method of claim 1, wherein said block of nucleotides has a length between 40 and 100 nucleotides.
 9. The method of claim 1, wherein in step (i) genome sequences are screened.
 10. The method of claim 1, wherein 5′-untranslated regions of open reading frames are screened.
 11. The method of claim 1, wherein a computer program is used for carrying out step (i).
 12. A computer program for carrying out step (i) of the method according to claim 1.
 13. A process of expressing a nucleotide sequence of interest in eukaryotic cell(s) by introducing into said cell(s) a vector comprising said nucleotide sequence of interest operably linked to an upstream IRES element identified according to the method of claim 1, whereby said nucleotide sequence of interest is translated cap-independently by way of said IRES element.
 14. The process according to claim 13, wherein said nucleotide sequence of interest is expressed from bicistronic or polycistronic mRNA.
 15. The process according to claim 13, wherein the IRES element is of plant origin.
 16. The process according to claim 13, wherein the IRES element is of mammalian origin.
 17. The process according to claim 13, wherein said IRES element comprises a sequence according to one of the sequences of FIG. 7 or FIG. 8 and of IRES-functional portions of the sequences of FIG. 7 and FIG. 8.
 18. The process according to claim 13, wherein said IRES element is derived from the 5′UTR of one of the following plant genes: heat shock factor-1, poly (A) binding protein, 48K MAP kinase.
 19. The process according to claim 18, wherein said genes are tobacco genes or genes homologous or orthologous thereto.
 20. The process according to claim 13, wherein said eukaryotic cell is a plant cell.
 21. The process according to claim 13, wherein said eukaryotic cell is an animal cell.
 22. The process according to claim 13, wherein said eukaryotic cell is a yeast cell.
 23. An IRES element identified by the method according to claim 1.
 24. An IRES element having a sequence selected from the group comprising the sequences of FIG. 7 or FIG. 8 and IRES-functional portions of the sequences of FIG. 7 and FIG. 8.
 25. Use of a nucleotide sequence of plant origin as an IRES element.
 26. A vector comprising an IRES element according to claim 23.
 27. Transgenic or transiently transfected eukaryotic cells transformed or transfected with a vector as defined in claim 13.
 28. Transgenic or transiently transfected eukaryotic cells transformed or transfected with a vector according to claim 26.